Reconstructing Posterior Distributions of a Species Phylogeny Using Estimated Gene Tree Distributions
Abstract
Inferring the evolutionary history of a group of species should be more feasible now that a considerable amount of multilocus molecular data is available. However, the current molecular phylogenetic paradigm still reconstructs gene trees to represent the species tree. Further, commonly used methods to combine data, such as the concatenation method, the consensus tree method, or the gene tree parsimony method, may be biased. In this dissertation, I propose a Bayesian hierarchical model to estimate the phylogeny of a group of species using multiple estimated gene tree distributions, such as those that arise in a Bayesian analysis of DNA sequence data. The model employs substitution models used in traditional phylogenetics, but also uses coalescent theory to explain genealogical signals from species trees to gene trees and from gene trees to sequence data, thereby forming a complete stochastic model to simultaneously estimate gene trees, species trees, ancestral population sizes, and species divergence times. The proposed model is founded on the assumption that gene trees, even of unlinked loci, are correlated because they are derived from a single species tree and therefore should be estimated jointly. The method is applied to three multilocus data sets of DNA sequences. The estimates of the species tree topology and divergence times appear to be robust to the prior on the population size, whereas the estimates of effective population sizes are sensitive to the prior used in the analysis. These analyses also suggest that the model is superior to the concatenation method in fitting these data sets and thus provides a more realistic assessment of the variability in the distribution of species trees that may have produced the molecular information at hand. Future improvements of our model and algorithm should include consideration of other factors that can cause discordance of gene trees and species trees, such as horizontal transfer or gene duplication.
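As a rough illustration of why gene trees from unlinked loci can disagree with the species tree while still carrying information about it, the sketch below uses the standard three-species coalescent result: for a species tree ((A,B),C) with an internal branch of length t in coalescent units, a gene tree matches the species tree topology with probability 1 - (2/3)exp(-t). This is not the dissertation's model or algorithm; the function names and the Monte Carlo check are assumptions made for illustration only.

```python
import math
import random


def concordance_probability(t):
    """Probability that a gene tree matches species tree ((A,B),C).

    t is the internal branch length in coalescent units (hypothetical input).
    """
    return 1.0 - (2.0 / 3.0) * math.exp(-t)


def simulate_concordance(t, n_loci, seed=0):
    """Monte Carlo estimate of the same probability over independent loci."""
    rng = random.Random(seed)
    matches = 0
    for _ in range(n_loci):
        # Waiting time for the A and B lineages to coalesce is Exp(1).
        if rng.expovariate(1.0) < t:
            # They coalesce inside the internal branch: concordant.
            matches += 1
        elif rng.random() < 1.0 / 3.0:
            # All three lineages reach the root population; each of the
            # three pairs is equally likely to coalesce first.
            matches += 1
    return matches / n_loci
```

With a short internal branch the concordance probability approaches 1/3, so many loci are expected to show a topology different from the species tree, which is the kind of discordance a joint model of gene trees and species trees has to account for.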
Similar resources
Species trees from gene trees: reconstructing Bayesian posterior distributions of a species phylogeny using estimated gene tree distributions.
The desire to infer the evolutionary history of a group of species should be more viable now that a considerable amount of multilocus molecular data is available. However, the current molecular phylogenetic paradigm still reconstructs gene trees to represent the species tree. Further, commonly used methods of combining data, such as the concatenation method, are known to be inconsistent in some...
Species Names in the PhyloCode: the approach adopted by the International Society for Phylogenetic Nomenclature.
Edwards, S. V., L. Liu, and D. K. Pearl. 2007. High resolution species trees without concatenation. Proc. Natl. Acad. Sci. USA 104:5936–5941. Hennig, W. 1950. Grundzüge Einer Theorie der Phylogenetischen Systematik. Deutscher Zentralverlag, Berlin. [Published in English translation in 1966: Phylogenetic systematics, University of Illinois Press, Urbana, Illinois.] Hudson, R. R. 1983. Testing t...
Reconstruction of Evolutionary Trees from Pairwise Distributions on Current Species
Suppose that the evolution of a character possessed by a number of current species is modelled as a Markov random field on an evolutionary tree. Suppose that for each pair of current species we know the joint probability distribution of the pair of characters possessed by that pair of species. We give conditions under which the evolutionary tree can be reconstructed from knowledge of these pairwi...
Fitting Tree Height Distributions in Natural Beech Forest Stands of Guilan (Case Study: Masal)
In this research, the modeling of tree height distributions of beech in the natural forests of Masal, located in Guilan province, was investigated. The inventory was carried out using systematic random sampling with grid dimensions of 150×200 m and a sample plot area of 0.1 ha. DBH and heights of 630 beech trees in 30 sample plots were measured. Beta, Gamma, Normal, Log-normal and Weibull prob...
Is diversification rate related to climatic niche width?
Methods: We characterized climatic niches for 5784 amphibian species using databases for species distributions and climate. We estimated the niche width of each family using the range of values for climatic variables across all sampled species, and using the mean of species niche widths. We estimated diversification rates for families given their total number of described species and a timecalib...